@pinin4fjords
Member

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/rnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Description

Fixes #1445

This PR updates the tximeta/tximport module to include the fix from nf-core/modules PR nf-core/modules#9407.

The Problem

Users reported that the pipeline was failing with the error:

Error in findColumnWithAllEntries(ids, metadata) : 
  No column contains all vector entries

This occurred when sample names:

  • Started with numbers (e.g., 1A2, 5B2)
  • Contained special characters like hyphens (e.g., sample-1, D10-D)

Root Cause

R's data.frame() function automatically rewrites column names into syntactically valid R names (via make.names()) when check.names = TRUE, which is the default:

  • Sample names starting with numbers get an "X" prepended: 1A2 → X1A2
  • Hyphens get converted to dots: D10-D → D10.D
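
The renaming described above can be reproduced in a few lines of base R. This is a minimal sketch using the sample names quoted in this PR. The count values are placeholders, and the code is not taken from the module itself.

```r
# Sample names taken from the examples in this PR description.
sample_names <- c("1A2", "5B2", "sample-1", "D10-D")

# One placeholder value per sample, named by sample.
counts <- setNames(as.list(c(10, 20, 30, 40)), sample_names)

# Default behaviour (check.names = TRUE): names go through make.names().
mangled <- do.call(data.frame, counts)
print(colnames(mangled))

# With check.names = FALSE the names are kept exactly as supplied.
preserved <- do.call(data.frame, c(counts, list(check.names = FALSE)))
print(colnames(preserved))
```

The first print shows the rewritten names (leading "X", dots instead of hyphens). The second shows the original sample names unchanged.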

While PR #1380 partially fixed this issue in v3.15.1 by adding check.names = FALSE to some functions, it missed the critical line in the tximport.r script where coldata is created. This directly affected sample names that became column names in output matrices, causing mismatches when the downstream summarizedexperiment process tried to match them against the original samplesheet metadata.
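
To illustrate why mangled names break the downstream matching step, here is a rough sketch. The helper `find_column_with_all_entries()` is a hypothetical stand-in written for this example. It only imitates the behaviour suggested by the error message and is not the pipeline's actual `findColumnWithAllEntries()`. The samplesheet contents are likewise made up.

```r
# Hypothetical stand-in for illustration only; the real helper may differ.
find_column_with_all_entries <- function(ids, metadata) {
  # For each metadata column, check whether it contains every id.
  hits <- vapply(metadata, function(col) all(ids %in% col), logical(1))
  if (!any(hits)) stop("No column contains all vector entries")
  names(metadata)[which(hits)[1]]
}

# Made-up samplesheet with the original sample names.
samplesheet <- data.frame(
  sample = c("1A2", "D10-D"),
  condition = c("ctrl", "treated")
)

# Count tables whose column names are the sample names.
counts_mangled <- data.frame(`1A2` = 5, `D10-D` = 7)
counts_ok <- data.frame(`1A2` = 5, `D10-D` = 7, check.names = FALSE)

# Preserved names match the samplesheet column.
matched <- find_column_with_all_entries(colnames(counts_ok), samplesheet)

# Mangled names (X1A2, D10.D) match nothing, so the lookup fails.
failure <- tryCatch(
  find_column_with_all_entries(colnames(counts_mangled), samplesheet),
  error = function(e) conditionMessage(e)
)
print(matched)
print(failure)
```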

Solution

Updated the tximeta/tximport module to the latest version, which adds check.names = FALSE to three function calls:

  1. read.csv() when reading transcript info (line 76)
  2. data.frame() when creating extra transcript info rows (line 79)
  3. data.frame() when creating coldata (line 134) - the main fix
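
The same argument is accepted by read.csv(), which passes it on to read.table(). The sketch below shows its effect on a header row. The inline CSV text is invented for this example and does not reflect the layout of the real transcript info file or the module's code.

```r
# Invented two-column CSV whose header holds sample-like names.
csv_text <- "1A2,D10-D\n5,7\n"

# Default: header names are rewritten into syntactically valid R names.
default_read <- read.csv(text = csv_text)

# With check.names = FALSE the header is kept verbatim.
fixed_read <- read.csv(text = csv_text, check.names = FALSE)

print(colnames(default_read))
print(colnames(fixed_read))
```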

Testing

Users can now safely use sample names with:

  • Leading numbers
  • Hyphens
  • Other special characters that are valid in file names

The fix preserves sample names exactly as provided in the input samplesheet.

Fixes #1445

Updates the tximeta/tximport module to include the fix from
nf-core/modules PR #9407, which adds check.names=FALSE to
data.frame() calls to prevent R from modifying sample names.

This resolves an issue where sample names starting with numbers
or containing special characters were being modified, causing
downstream errors in the SUMMARIZEDEXPERIMENT process when trying
to match sample IDs between count matrices and samplesheet metadata.
@github-actions

github-actions bot commented Nov 14, 2025

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 6f4518c

+| ✅ 286 tests passed       |+
#| ❔   7 tests were ignored |#
!| ❗   9 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: assets/multiqc_config.yml
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!
  • pipeline_todos - TODO string in nextflow.config: Specify any additional parameters here
  • pipeline_if_empty_null - ifEmpty(null) found in /home/runner/work/rnaseq/rnaseq/subworkflows/local/prepare_genome/main.nf: versions = ch_versions.ifEmpty(null) // channel: [ versions.yml ]

❔ Tests ignored:

✅ Tests passed:

Run details

  • nf-core/tools version 3.4.1
  • Run at 2025-11-14 15:45:00

- '"quants/*"':
type: directory
description: Directory containing quantification files
- quants/*: {}
Contributor

What happened here?

Development

Successfully merging this pull request may close these issues.

Error in findColumnWithAllEntries(ids, metadata) : No column contains all vector entries